Pooled Genomic Indexing (PGI): Mathematical Analysis and Experiment Design
Abstract
Pooled Genomic Indexing (PGI) is a novel method for physical mapping of clones onto known macromolecular sequences. PGI is carried out by pooling arrayed clones, generating shotgun sequence reads from the pools, and comparing the reads against a reference sequence. If two reads from two different pools match the reference sequence within a short distance of each other, both are assigned (deconvoluted) to the clone at the intersection of the two pools, and the clone is mapped onto the region of the reference sequence between the two matches. A probabilistic model for PGI is developed, and several pooling schemes are designed and analyzed. The probabilistic model and the pooling schemes are validated in simulated experiments in which 625 rat BAC clones and 207 mouse BAC clones are mapped onto homologous human sequence.
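The deconvolution rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `deconvolute`, the `max_distance` threshold, the 2x2 row/column array, and the hit coordinates are all hypothetical, and each read match is reduced to a single reference position for simplicity.

```python
from itertools import combinations


def deconvolute(pools, hits, max_distance):
    """Assign pairs of nearby read matches to the clone shared by their pools.

    pools        -- dict mapping a pool id to the set of clone ids it contains
    hits         -- list of (pool_id, reference_position) read matches
    max_distance -- hypothetical cutoff for "a close distance" on the reference
    Returns a list of (clone_id, region_start, region_end) tuples.
    """
    assignments = []
    for (pool_a, pos_a), (pool_b, pos_b) in combinations(hits, 2):
        if pool_a == pool_b:
            continue  # the two reads must come from different pools
        if abs(pos_a - pos_b) > max_distance:
            continue  # the matches are too far apart on the reference
        shared = pools[pool_a] & pools[pool_b]
        if len(shared) == 1:
            # Exactly one clone sits at the intersection of the two pools.
            clone = next(iter(shared))
            assignments.append((clone, min(pos_a, pos_b), max(pos_a, pos_b)))
    return assignments


# Assumed toy design: a 2x2 clone array pooled by rows (R*) and columns (C*).
pools = {
    "R0": {"c00", "c01"},
    "R1": {"c10", "c11"},
    "C0": {"c00", "c10"},
    "C1": {"c01", "c11"},
}
# Invented read matches: (pool id, position on the reference sequence).
hits = [("R0", 1000), ("C1", 1500), ("R1", 90000)]

result = deconvolute(pools, hits, max_distance=2000)
print(result)
```

In this toy run only the matches from pools `R0` and `C1` lie close together, and those pools share the single clone `c01`, so that clone is mapped to the region between the two matches. A real experiment would also have to handle spurious matches and ambiguous intersections, which is where the probabilistic model comes in.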
Similar resources
Pooled Genomic Indexing (PGI): Analysis and Design of Experiments
Pooled Genomic Indexing (PGI) is a novel method for physical mapping of clones onto known sequences. PGI is carried out by pooling arrayed clones and generating shotgun sequence reads from the pools. The shotgun sequences are compared to a reference sequence. In the simplest case, clones are placed on an array and are pooled by rows and columns. If a shotgun sequence from a row pool and another...
A chronic fatigue syndrome – related proteome in human cerebrospinal fluid
BACKGROUND Chronic Fatigue Syndrome (CFS), Persian Gulf War Illness (PGI), and fibromyalgia are overlapping symptom complexes without objective markers or known pathophysiology. Neurological dysfunction is common. We assessed cerebrospinal fluid to find proteins that were differentially expressed in this CFS-spectrum of illnesses compared to control subjects. METHODS Cerebrospinal fluid speci...
Layout-based substitution tree indexing and retrieval for mathematical expressions
We introduce a new system for layout-based (LaTeX) indexing and retrieving mathematical expressions using substitution trees. Substitution trees can efficiently store and find expressions based on the similarity of their symbols, symbol layout, sub-expressions and size. We describe our novel design and some of our contributions to the substitution tree indexing and retrieval algorithms. We provi...
Design and Statistical Analysis of Pooled Next Generation Sequencing for Rare Variants
Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well con...
Computational methods for high-throughput pooled genetic experiments
Advances in high-throughput DNA sequencing have created new avenues of attack for classical genetics problems. This thesis develops and applies principled methods for analyzing DNA sequencing data from multiple pools of individual genomes. Theoretical expectations under several genetic models are used to inform specific experimental designs and guide the allocation of experimental resources. A ...